AITopics | neural text generation

A Contrastive Framework for Neural Text Generation

Neural Information Processing Systems, Dec-24-2025, 16:52:14 GMT

Text generation is of great importance to many natural language processing applications. However, maximization-based decoding methods (e.g., beam search) of neural language models often lead to degenerate solutions: the generated text is unnatural and contains undesirable repetitions. Existing approaches introduce stochasticity via sampling or modify training objectives to decrease the probabilities of certain tokens (e.g., unlikelihood training). However, they often lead to solutions that lack coherence. In this work, we show that an underlying reason for model degeneration is the anisotropic distribution of token representations. We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method, contrastive search, to encourage diversity while maintaining coherence in the generated text. Extensive experiments and analyses on three benchmarks from two languages demonstrate that our proposed approach outperforms state-of-the-art text generation methods as evaluated by both human and automatic metrics.
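The abstract does not spell out how contrastive search chooses a token. A minimal sketch of one decoding step follows, assuming the commonly described scoring rule: each top-k candidate is scored by its model probability minus a degeneration penalty, where the penalty is the largest cosine similarity between the candidate's hidden state and the hidden states of the context. The function names, the toy two-dimensional vectors, and the weight `alpha=0.6` are illustrative assumptions, not taken from the paper's code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contrastive_step(candidates, context_states, alpha=0.6):
    """Pick one token from the top-k candidates.

    candidates: list of (token, model_probability, hidden_state) tuples.
    context_states: hidden states of the tokens generated so far.
    alpha: weight of the degeneration penalty (0 reduces to greedy search).
    """
    best_token, best_score = None, -math.inf
    for token, prob, state in candidates:
        # Penalty: how close this candidate is to anything already in context.
        penalty = max((cosine(state, h) for h in context_states), default=0.0)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy data (assumed): two context states and two candidate tokens.
context = [[1.0, 0.0], [0.9, 0.1]]
candidates = [
    ("the", 0.50, [1.0, 0.0]),    # most likely, but identical to a context state
    ("river", 0.40, [0.0, 1.0]),  # slightly less likely, dissimilar to context
]

choice = contrastive_step(candidates, context)             # penalized selection
greedy = contrastive_step(candidates, context, alpha=0.0)  # pure maximization
print(choice, greedy)
```

With `alpha=0.0` the rule reduces to picking the most probable candidate, which is the maximization-based behavior the abstract associates with repetition.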

contrastive framework, name change, neural text generation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

Neural Information Processing Systems, Dec-23-2025, 19:41:30 GMT

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in the human corpus (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probability of repetitive tokens and their previous repetitions in context. Through our quantitative experiments, we find that 1) models have a preference to repeat the previous sentence; 2) the sentence-level repetitions have a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; 3) the sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by our findings, we propose a simple and effective training method, DITTO (PseuDo-RepetITion PenalizaTiOn), where the model learns to penalize probabilities of sentence-level repetitions from synthetic repetitive data. Although our method is motivated by mitigating repetitions, our experiments show that DITTO not only mitigates the repetition issue without sacrificing perplexity, but also achieves better generation quality. Extensive experiments on open-ended text generation (Wikitext-103) and text summarization (CNN/DailyMail) demonstrate the generality and effectiveness of our method.
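To make "consecutive sentence-level repetition" concrete, the sketch below measures the fraction of sentences that exactly repeat the sentence before them. It is an illustration only: the function name is hypothetical, sentences are split naively on a period, and the paper's own metric may be defined differently.

```python
def consecutive_repetition_rate(text, end_token="."):
    """Fraction of adjacent sentence pairs in which the second sentence
    is an exact copy of the first (naive split on `end_token`)."""
    sentences = [s.strip() for s in text.split(end_token) if s.strip()]
    if len(sentences) < 2:
        return 0.0
    repeats = sum(1 for prev, cur in zip(sentences, sentences[1:]) if prev == cur)
    return repeats / (len(sentences) - 1)


# Toy inputs (assumed): human-like text versus a degenerate loop.
human = "The tower is located in Paris. It opened in 1889. Millions visit each year."
looped = "It is located in Paris. It is located in Paris. It is located in Paris."

human_rate = consecutive_repetition_rate(human)
looped_rate = consecutive_repetition_rate(looped)
print(human_rate, looped_rate)
```

A rate near zero corresponds to the human-corpus behavior the abstract cites, while a rate near one corresponds to the sentence-level loops produced by maximization-based decoding.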

analyzing and mitigating repetition, repetition, sentence-level repetition, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Appendix of 'Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation'

Neural Information Processing Systems, Oct-2-2025, 12:29:10 GMT

We calculate it for each sequence x and average over the whole corpus. When decoding auto-regressively, the probabilities of the repetitive sentence loops also have a self-reinforcement effect. As shown in Figure 2, the probability of the token 'located' increases almost [...] The work was conducted at Apple. Here we use the end token to split sentences for ease of experiments. We present the probability of the token 'located' (y-axis) as the number of historical repetitions [...] Best viewed in color and zoomed in on a desktop monitor.

artificial intelligence, natural language, repetition, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > Ohio (0.04)
North America > United States > Missouri > Buchanan County > Saint Joseph (0.04)
(6 more...)

Industry:

Media > Film (1.00)
Government (1.00)
Leisure & Entertainment > Sports > Basketball (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.67)


A Contrastive Framework for Neural Text Generation

Neural Information Processing Systems, Jan-17-2025, 08:08:14 GMT

Text generation is of great importance to many natural language processing applications. However, maximization-based decoding methods (e.g., beam search) of neural language models often lead to degenerate solutions: the generated text is unnatural and contains undesirable repetitions. Existing approaches introduce stochasticity via sampling or modify training objectives to decrease the probabilities of certain tokens (e.g., unlikelihood training). However, they often lead to solutions that lack coherence. In this work, we show that an underlying reason for model degeneration is the anisotropic distribution of token representations.

contrastive framework, neural text generation, training objective, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation

Neural Information Processing Systems, Oct-9-2024, 21:25:03 GMT

While large-scale neural language models, such as GPT2 and BART, have achieved impressive results on various text generation tasks, they tend to get stuck in undesirable sentence-level loops with maximization-based decoding algorithms (e.g., greedy search). This phenomenon is counter-intuitive since there are few consecutive sentence-level repetitions in the human corpus (e.g., 0.02% in Wikitext-103). To investigate the underlying reasons for generating consecutive sentence-level repetitions, we study the relationship between the probability of repetitive tokens and their previous repetitions in context. Through our quantitative experiments, we find that 1) models have a preference to repeat the previous sentence; 2) the sentence-level repetitions have a self-reinforcement effect: the more times a sentence is repeated in the context, the higher the probability of continuing to generate that sentence; 3) the sentences with higher initial probabilities usually have a stronger self-reinforcement effect. Motivated by our findings, we propose a simple and effective training method, DITTO (PseuDo-RepetITion PenalizaTiOn), where the model learns to penalize probabilities of sentence-level repetitions from synthetic repetitive data.

analyzing and mitigating repetition, repetition, sentence-level repetition, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)


Generating Human-level Text with Contrastive Search in Transformers 🤗

#artificialintelligence, Dec-18-2022, 18:10:13 GMT

In this blog post, we introduce the current state-of-the-art decoding method, Contrastive Search, for neural text generation. Contrastive search was originally proposed in "A Contrastive Framework for Neural Text Generation" [1] ([Paper] [Official Implementation]) at NeurIPS 2022. Moreover, in the follow-up work, "Contrastive Search Is What You Need For Neural Text Generation" [2] ([Paper] [Official Implementation]), the authors further demonstrate that contrastive search can generate human-level text using off-the-shelf language models across 16 languages. Contrastive Search is now available in transformers, on both PyTorch and TensorFlow. You can interact with the examples shown in this blog post using your framework of choice in this colab notebook, which is linked at the top.

contrastive search, greedy search, language model, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)


Neural Text Generation with Part-of-Speech Guided Softmax

Yang, Zhixian; Wan, Xiaojun

arXiv.org Artificial Intelligence, May-8-2021

Neural text generation models are likely to suffer from the low-diversity problem. Various decoding strategies and training-based methods have been proposed to promote diversity only by exploiting contextual features, but rarely do they consider incorporating syntactic structure clues. In this work, we propose using linguistic annotation, i.e., part-of-speech (POS), to guide the text generation. In detail, we introduce POS Guided Softmax (POSG-Softmax) to explicitly model two posterior probabilities: (i) next-POS, and (ii) next-token from the vocabulary of the target POS. A POS guided sampling strategy is further proposed to address the low-diversity problem by enriching the diversity of POS. Extensive experiments and human evaluations demonstrate that, compared with existing state-of-the-art methods, our proposed methods can generate more diverse text while maintaining comparable quality.
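The two-stage factorization described above can be sketched as follows: first choose the next part-of-speech tag, then choose a token from that tag's vocabulary, so that a token's overall probability is the product of the two. This is a toy illustration under stated assumptions: the function names, the hand-written probability tables, and the simplification that each token belongs to one POS vocabulary are hypothetical, and the paper's model would compute both distributions with a neural network.

```python
import random


def marginal_token_probs(pos_probs, token_probs_by_pos):
    """Combine P(pos) and P(token | pos) into P(token).

    Tokens appearing under several POS tags have their contributions summed.
    """
    probs = {}
    for pos, p_pos in pos_probs.items():
        for token, p_tok in token_probs_by_pos[pos].items():
            probs[token] = probs.get(token, 0.0) + p_pos * p_tok
    return probs


def sample_token(pos_probs, token_probs_by_pos, rng):
    """Two-stage sampling: draw a POS tag first, then a token for that tag."""
    pos = rng.choices(list(pos_probs), weights=list(pos_probs.values()))[0]
    tokens = token_probs_by_pos[pos]
    token = rng.choices(list(tokens), weights=list(tokens.values()))[0]
    return pos, token


# Toy distributions (assumed values, for illustration only).
pos_probs = {"NOUN": 0.7, "VERB": 0.3}
token_probs_by_pos = {
    "NOUN": {"cat": 0.5, "dog": 0.5},
    "VERB": {"runs": 1.0},
}

probs = marginal_token_probs(pos_probs, token_probs_by_pos)
pos, token = sample_token(pos_probs, token_probs_by_pos, random.Random(0))
print(probs, pos, token)
```

Because the POS tag is drawn first, reshaping the POS distribution (for example, flattening it) changes which vocabularies get sampled, which is one way to read the abstract's claim that diversity is promoted by enriching the diversity of POS.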

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2105.03641

Country: